Abstract

A neat 1972 result of Pohl asserts that $\lceil 3n/2 \rceil - 2$ comparisons are
sufficient, and also necessary in the worst case, for finding both the minimum
and the maximum of an $n$-element totally ordered set. The set is accessed via
an oracle for pairwise comparisons. More recently, the problem has been studied
in the context of the Rényi-Ulam liar games, where the oracle may give up to
$k$ false answers. For large $k$, an upper bound due to Aigner shows that
$(k + O(\sqrt{k}))n$ comparisons suffice. We improve on this by providing an
algorithm with at most $(k + 1 + C)n + O(k^3)$ comparisons for some constant
$C$. The known lower bounds are of the form $(k + 1 + c_k)n - D$, for some
constant $D$, where $c_0 = 0.5$, $c_1 = 23/32 = 0.71875$, and
$c_k = \Omega(2^{-5k/4})$ as $k \to \infty$.
1 Introduction



We consider an $n$-element set $X$ with an unknown total ordering $<$. The
ordering can be accessed via an oracle that, given two elements $x, y \in X$, tells
us whether $x < y$ or $x > y$. It is easily seen that the minimum element of $X$
can be found using $n - 1$ comparisons. This is optimal in the sense that $n - 2$
comparisons are not enough to find the minimum element in the worst case.

One of the nice little surprises in computer science is that if we want to find 
both the minimum and the maximum, we can do significantly better than finding 
the minimum and the maximum separately. Pohl [8] proved that $\lceil 3n/2 \rceil - 2$ is
the optimal number of comparisons for this problem ($n \ge 2$). The algorithm
first partitions the elements of X into pairs and makes a comparison in each 
pair. The minimum can then be found among the "losers" of these comparisons, 
while the maximum is found among the "winners." 
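
The pairing strategy just described can be sketched in a few lines of Python. This is only an illustration for a truthful oracle: the function name `min_and_max` and the explicit comparison counter are assumptions of this sketch, and the elements are assumed to be distinct.

```python
def min_and_max(items):
    """Return (minimum, maximum, comparisons used) via the pairing strategy."""
    n = len(items)
    if n == 0:
        raise ValueError("need at least one element")
    comparisons = 0
    losers, winners = [], []
    # One comparison per pair splits the elements into losers and winners.
    for i in range(0, n - 1, 2):
        a, b = items[i], items[i + 1]
        comparisons += 1
        if a < b:
            losers.append(a)
            winners.append(b)
        else:
            losers.append(b)
            winners.append(a)
    if n % 2 == 1:
        # An unpaired last element may still be the minimum or the maximum.
        losers.append(items[-1])
        winners.append(items[-1])
    # The minimum is found among the losers ...
    lo = losers[0]
    for x in losers[1:]:
        comparisons += 1
        if x < lo:
            lo = x
    # ... and the maximum among the winners.
    hi = winners[0]
    for x in winners[1:]:
        comparisons += 1
        if x > hi:
            hi = x
    return lo, hi, comparisons
```

For even $n$ the count is $n/2 + 2(n/2 - 1) = 3n/2 - 2$, and a similar count for odd $n$ gives $\lceil 3n/2 \rceil - 2$ in both cases.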

Here we consider the problem of determining both the minimum and the 
maximum in the case where the oracle is not completely reliable: it may some- 
times give a false answer, but only at most k times during the whole computa- 
tion, where k is a given parameter. 

We refer to this model as computation against $k$ lies. Let us stress that
we allow repeating the same query to the oracle several times, and each false
answer counts as a lie. This seems to be the most sensible definition: if repeated
queries were not allowed, or if the oracle could always give the wrong answer to
a particular query, then the minimum could not be determined.

So, for example, if we repeat a given query 2k + 1 times, we always get 
the correct answer by majority vote. Thus, we can simulate any algorithm 
with a reliable oracle, asking every question 2k + 1 times, but for the problems 
considered here, this is not a very efficient way, as we will see. 
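
The majority-vote simulation can be made concrete as follows. This is a minimal sketch: the class `LyingOracle` (which lies at random until its budget of $k$ lies is used up) and the function `reliable_less` are hypothetical names introduced only for this illustration.

```python
import random


class LyingOracle:
    """Answers 'is x < y?' but may lie at most k times in total (hypothetical helper)."""

    def __init__(self, k, lie_probability=0.5, seed=0):
        self.lies_left = k
        self.queries = 0
        self._rng = random.Random(seed)
        self._p = lie_probability

    def less(self, x, y):
        self.queries += 1
        truth = x < y
        if self.lies_left > 0 and self._rng.random() < self._p:
            self.lies_left -= 1
            return not truth  # a false answer, counted as one lie
        return truth


def reliable_less(oracle, x, y, k):
    """Ask the same question 2k + 1 times and return the majority answer."""
    votes = sum(oracle.less(x, y) for _ in range(2 * k + 1))
    return votes > k
```

Since at most $k$ of the $2k + 1$ answers are false, the majority is always truthful, at the price of a factor $2k + 1$ in the number of comparisons.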

The problem of finding both the minimum and the maximum against $k$ lies
was investigated by Aigner [1], who proved that $(k + O(\sqrt{k}))n$ comparisons
always suffice.¹ We improve on this as follows.

Theorem 1. There is an algorithm that finds both the minimum and the maximum
among $n$ elements against $k$ lies using at most $(k + 1 + C)n + O(k^3)$
comparisons, where $C$ is a constant.

Our proof yields the constant C reasonably small (below 10, say, at least if 
k is assumed to be sufficiently large), but we do not try to optimize it. 

Lower bounds. The best known lower bounds for the number of comparisons
necessary to determine both the minimum and the maximum against $k$ lies have
the form $(k + 1 + c_k)n - D$, where $D$ is a small constant and the $c_k$ are as follows:

• $c_0 = 0.5$, and this is the best possible. This is the result of Pohl [8] for a
truthful oracle mentioned above.

¹ Here and in the sequel, $O(\cdot)$ and $\Omega(\cdot)$ hide only absolute constants, independent of both
$n$ and $k$.
• $c_1 = 23/32 = 0.71875$, and this is again tight. This follows from a recent
work by Gerbner, Pálvölgyi, Patkós, and Wiener [5], who determined the
optimum number of comparisons for $k = 1$ up to a small additive constant:
it lies between $\lceil 87n/32 \rceil - 3$ and $\lceil 87n/32 \rceil + 12$. This proves a conjecture of
Aigner [1].

• $c_k = \Omega(2^{-5k/4})$ for all $k$, as was shown by Aigner [1].

The optimal constant $c_1 = 23/32$ indicates that obtaining precise answers for
$k > 1$ may be difficult.

Related work. The problem of determining the minimum alone against k 
lies was resolved by Ravikumar, Ganesan, and Lakshmanan [9], who proved 
that finding the minimum against k lies can be performed by using at most 
$(k + 1)n - 1$ comparisons, and this is optimal in the worst case.
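
To see why a bound of this form is plausible, here is a sketch of one simple strategy whose comparison count matches $(k + 1)n - 1$. It is not claimed to be the algorithm of [9]; the name `min_against_lies` and the calling convention `less(x, y)` for the possibly lying oracle are assumptions of this illustration. A current champion duels each new element until one of the two has been declared larger $k + 1$ times; that element cannot be the minimum, because at least one of those answers was truthful.

```python
def min_against_lies(items, less, k):
    """Find the minimum when less(x, y) may give at most k false answers overall."""
    champion = items[0]
    for challenger in items[1:]:
        champ_larger = 0  # times the champion was declared larger in this duel
        chall_larger = 0  # times the challenger was declared larger in this duel
        while champ_larger <= k and chall_larger <= k:
            if less(champion, challenger):
                chall_larger += 1
            else:
                champ_larger += 1
        # Whoever was declared larger k + 1 times is truly larger: discard it.
        if champ_larger == k + 1:
            champion = challenger
    return champion
```

A duel containing $\ell$ lies ends after exactly $k + 1 + \ell$ comparisons, because the truly smaller element is declared larger only by lies. Summing over the $n - 1$ duels gives at most $(k + 1)(n - 1) + k = (k + 1)n - 1$ comparisons.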

The problem considered in this paper belongs to the area of searching prob- 
lems against lies and, in a wider context, it is an example of "computation 
in the presence of errors." This field has a rich history and beautiful results. 
A prototype problem, still far from completely solved, is the Rényi-Ulam liar
game from the 1960s, where one wants to determine an unknown integer x be- 
tween 1 and n, an oracle provides comparisons of x with specified numbers, and 
it may give at most k false answers. We refer to the surveys by Pelc [7] and by 
Deppe [2] for more information. 

2 A simple algorithm 

Before proving Theorem 1, we explain a simpler algorithm, which illustrates the
main ideas but yields a weaker bound. We begin with formulating a generic 
algorithm, with some steps left unspecified. Both the simple algorithm in this 
section and an improved algorithm in the next sections are instances of the 
generic algorithm. 



The generic algorithm 

1. For a suitable integer parameter $s = s(k)$, we arbitrarily partition
the considered $n$-element set $X$ into $n/s$ groups $X_1, \ldots, X_{n/s}$ of size
$s$ each.ᵃ

2. In each group $X_i$, we find the minimum $m_i$ and the maximum $M_i$.
The method for doing this is left unspecified in the generic algorithm.

3. We find the minimum of $\{m_1, \ldots, m_{n/s}\}$ against $k$ lies, and independently
we find the maximum of $\{M_1, M_2, \ldots, M_{n/s}\}$ against $k$ lies.



ᵃ If $n$ is not divisible by $s$, we can form an extra group smaller than $s$ and treat it
separately, say; we will not bore the reader with the details.
The correctness of the generic algorithm is clear, provided that Step 2 is
implemented correctly. Eventually, we set $s := k$ in the simple and in the
improved algorithm. However, we keep $s$ as a separate parameter, because the
choice $s := k$ is in a sense accidental.

In the simple algorithm we implement Step 2 as follows.



Step 2 in the simple algorithm

2.1. (Sorting.) We sort the elements of $X_i$ by an asymptotically optimal
sorting algorithm, say mergesort, using $O(s \log s)$ comparisons,
and ignoring the possibility of lies. Thus, we obtain an ordering
$x_1, x_2, \ldots, x_s$ of the elements of $X_i$ such that if all queries during the
sorting have been answered correctly, then $x_1 < x_2 < \cdots < x_s$. If
there was at least one false answer, we make no assumptions, except
that the sorting algorithm does not crash and outputs some ordering.

2.2. (Verifying the minimum and maximum.) For each $j = 2, 3, \ldots, s$, we
query the oracle $k + 1$ times with the pair $x_{j-1}, x_j$. If any of these
queries returns the answer $x_{j-1} > x_j$, we restart: We go back to
Step 2.1 and repeat the computation for the group $X_i$ from scratch.
Otherwise, if all the answers are $x_{j-1} < x_j$, we proceed with the next
step.

2.3. We set $m_i := x_1$ and $M_i := x_s$.
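
Steps 2.1 to 2.3 for a single group can be sketched as follows. The name `process_group` and the oracle interface `less(x, y)` are assumptions of this illustration, and Python's built-in sort stands in for mergesort: it uses $O(s \log s)$ comparisons and returns some permutation even when the answers are inconsistent.

```python
import functools


def process_group(group, less, k):
    """Return (minimum, maximum) of one group; less(x, y) may lie at most k times overall."""
    while True:
        # Step 2.1: sort the group, ignoring the possibility of lies.
        order = sorted(
            group,
            key=functools.cmp_to_key(lambda x, y: -1 if less(x, y) else 1),
        )
        # Step 2.2: query every adjacent pair k + 1 times; stop at the first
        # answer that contradicts the computed order.
        verified = all(
            less(order[j - 1], order[j])
            for j in range(1, len(order))
            for _ in range(k + 1)
        )
        if verified:
            # Step 2.3: the ends of the verified order are the extremes.
            return order[0], order[-1]
        # Otherwise restart from scratch; each restart consumes at least one lie.
```

Every pass through the loop that fails verification requires at least one false answer, so with at most $k$ lies the loop runs at most $k + 1$ times.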



Lemma 1 (Correctness). The simple algorithm always correctly computes the 
minimum and the maximum against k lies. 

Proof. We note that once the processing of the group $X_i$ in the above algorithm
reaches Step 2.3, then $m_i = x_1$ has to be the minimum. Indeed, for every other
element $x_j$, $j \ge 2$, the oracle has answered $k + 1$ times that $x_j > x_{j-1}$, and
hence $x_j$ cannot be the minimum. Similarly, $M_i$ has to be the maximum, and
thus the algorithm is always correct. □

Actually, at Step 2.3 we can be sure that $x_1, \ldots, x_s$ is the sorted order of
$X_i$, but in the improved algorithm in the next section the situation will be more
subtle. The next lemma shows that the simple algorithm already provides an
improvement of Aigner's bound of $(k + O(\sqrt{k}))n$.

Lemma 2 (Complexity). The number of comparisons of the simple algorithm
for $s = k$ on an $n$-element set is $(k + O(\log k))n + O(k^3)$.

Proof. For processing the group $X_i$ in Step 2 we need $O(s \log s) + (k + 1)(s - 1) =
k^2 + O(k \log k)$ comparisons, provided that no restart is required. But since
restarts may occur only if the oracle lies at least once, and the total number
of lies is at most $k$, there are no more than $k$ restarts for all groups together.
These restarts may account for at most $k(k^2 + O(k \log k)) = O(k^3)$ comparisons.
Thus, the total number of comparisons in Step 2 is $\frac{n}{k}(k^2 + O(k \log k)) + O(k^3) =
(k + O(\log k))n + O(k^3)$.

As we mentioned in the introduction, the minimum (or maximum) of an
$n$-element set against $k$ lies can be found using $(k + 1)n - 1$ comparisons, and
so Step 3 needs no more than $2(k + 1)(n/s) = O(n)$ comparisons. (We do not
really need the optimal algorithm for finding the minimum; any $O((k + 1)n)$
algorithm would do.) The claimed bound on the total number of comparisons
follows. □



3 The improved algorithm: Proof of Theorem 1

In order to certify that $x_1$ is indeed the minimum of $X_i$, we want that for every
$x_j$, $j \ne 1$, the oracle declares $x_j$ larger than some other element $k + 1$ times. (In
the simple algorithm, these $k + 1$ comparisons were all made with $x_{j-1}$, but any
other smaller elements will do.) This in itself requires $(k + 1)(s - 1)$ queries per
group, or $(k + 1)(n - n/s)$ in total, which is already close to our target upper
bound in Theorem 1 (we note that $s$ has to be at least of order $k$, for otherwise
Step 3 of the generic algorithm would be too costly).

Similarly, every $x_j$, $j \ne s$, should be compared with larger elements $k + 1$
times, which again needs $(k + 1)(n - n/s)$ comparisons, so all but $O(n)$ comparisons
in the whole algorithm had better be used for both of these purposes.

In the simple algorithm, the comparisons used for sorting the groups in
Step 2.1 are, in this sense, wasted. The remedy is to use most of them also for
verifying the minimum and maximum in Step 2.2. For example, if the sorting
algorithm has already made comparisons of $x_{17}$ with 23 larger elements, in the
verification step it suffices to compare $x_{17}$ with $k + 1 - 23$ larger elements.

One immediate problem with this appears if the sorting algorithm compares
$x_{17}$ with some $b > k + 1$ larger elements: the extra $b - (k + 1)$ comparisons are
wasted. However, for us, this will not be an issue, because we will have $s = k$,
and thus each element can be compared to at most $k - 1$ others (assuming, as
we may, that the sorting algorithm does not repeat any comparison).

Another problem is somewhat more subtle. In order to explain it, let us 
represent the comparisons made in the sorting algorithm by edges of an ordered 
graph. The vertices are $1, 2, \ldots, s$, representing the elements $x_1, \ldots, x_s$ of $X_i$
in sorted order, and the edges correspond to the comparisons made during the 
sorting, see the figure below on the left. 
In the verification step, we need to make additional comparisons so that
every $x_j$, $j \ne 1$, has at least $k + 1$ comparisons with smaller elements and every
$x_j$, $j \ne s$, has at least $k + 1$ comparisons with larger elements. This corresponds
to adding suitable extra edges in the graph, as in the right drawing above (where
$k = 2$, and the added edges are drawn on the bottom side).

As the picture illustrates, sometimes we cannot avoid comparing some element
with more than $k + 1$ larger ones or $k + 1$ smaller ones (and thus some
of the comparisons will be "half-wasted"). For example, no matter how we add
the extra edges, the elements $x_1, x_2, x_3$ together must participate in at least 3
half-wasted comparisons. Indeed, $x_2$ and $x_3$ together require 6 comparisons to
the left (i.e., with a smaller element). These comparisons can be "provided" only
by $x_1$ and $x_2$, which together want only 6 comparisons to the right, but 3 of
these comparisons to the right were already made with elements larger than $x_3$
(these are the arcs intersecting the dotted vertical line in the picture).

The next lemma shows that this kind of argument is the only source of wasted 
comparisons. For an ordered multigraph $H$ on the vertex set $\{1, 2, \ldots, s\}$ as
above, let us define $t(H)$, the thickness of $H$, as $\max\{t(j) : j = 2, 3, \ldots, s - 1\}$,
where $t(j) := |\{\{a, b\} \in E(H) : a < j < b\}|$ is the number of edges going "over"
the vertex $j$.
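
The definition of thickness translates directly into code. In this minimal sketch, the multigraph is represented as a list of vertex pairs, and the function name `thickness` is an assumption of the illustration.

```python
def thickness(edges, s):
    """Return t(H) for an ordered multigraph on vertices 1..s given as a list of pairs."""

    def over(j):
        # Number of edges {a, b} with a < j < b, i.e. edges going "over" vertex j.
        return sum(1 for a, b in edges if min(a, b) < j < max(a, b))

    # The maximum ranges over the inner vertices 2, ..., s - 1 only.
    return max((over(j) for j in range(2, s)), default=0)
```

A path on consecutive vertices has thickness 0, because none of its edges passes over a vertex, whereas every edge between non-adjacent positions contributes to $t(j)$ for each vertex $j$ strictly between its endpoints.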

Lemma 3. Let $H$ be an undirected multigraph without loops on $\{1, 2, \ldots, s\}$
such that for every vertex $j = 1, 2, \ldots, s$,

$$d_H^{\mathrm{left}}(j) := |\{\{i, j\} \in E(H) : i < j\}| \le k + 1,$$
$$d_H^{\mathrm{right}}(j) := |\{\{i, j\} \in E(H) : i > j\}| \le k + 1.$$

Then $H$ can be extended to a multigraph $\overline{H}$ by adding edges, so that

(i) every vertex $j \ne 1$ has at least $k + 1$ left neighbors and every vertex $j \ne s$
has at least $k + 1$ right neighbors; and

(ii) the total number of edges in $\overline{H}$ is at most $(k + 1)(s - 1) + t(H)$.

The proof is a network flow argument and therefore constructive. We postpone
it to the end of this section.

For a comparison-based sorting algorithm $A$, we define the thickness $t_A(s)$
in the natural way: It is the maximum, over all $s$-element input sequences, of
the thickness $t(H)$ of the corresponding ordered graph $H$ (the vertices of $H$ are
ordered as in the output of the algorithm and each comparison contributes
an edge between its corresponding vertices). As the above lemma shows, the
number of comparisons used for the sorting but not for the verification can be
bounded by the thickness of the sorting algorithm.

Lemma 4. There exists a (deterministic) sorting algorithm $A$ with thickness

$$t_A(s) = O(s).$$

Proof. The algorithm is based on Quicksort, but in order to control the thickness,
we want to partition the elements into two groups of equal size in each
recursive step.
We thus begin with computing the median of the given elements. This can be
done using O(s) comparisons (see, e.g., Knuth [6]; the current best deterministic 
algorithm due to Dor and Zwick [3] uses no more than 2.95s+o(s) comparisons). 
These algorithms also divide the remaining elements into two groups, those 
smaller than the median and those larger than the median. To obtain a sorting 
algorithm, we simply recurse on each of these groups. 

The thickness of this algorithm obeys the recursion $t_A(s) \le O(s) + t_A(\lfloor s/2 \rfloor)$,
and thus it is bounded by $O(s)$. □
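
Unrolling the recursion makes the linear bound explicit. Here $c$ denotes the constant hidden in the $O(s)$ term; this symbol is introduced only for the illustration.

```latex
t_A(s) \;\le\; cs + t_A(\lfloor s/2 \rfloor)
       \;\le\; cs + \frac{cs}{2} + \frac{cs}{4} + \cdots
       \;\le\; 2cs \;=\; O(s).
```

Only one of the two halves appears in the recursion, because a comparison made inside one half never passes over a vertex of the other half.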

We are going to use the algorithm $A$ from the lemma in the setting where
some of the answers of the oracle may be wrong. Then the median selection
algorithm is not guaranteed to partition the current set into two groups of the
same size, and its running time may differ from the lie-free case. However,
we can check whether the groups have the right size and whether the running time
does not increase too much. If some test fails, we restart the computation
(similarly to the simple algorithm).

Now we can describe the improved algorithm, again by specifying Step 2 of
the generic algorithm.



Step 2 in the improved algorithm

2.1'. (Sorting.) We sort the elements of $X_i$ by the algorithm $A$ with thickness
$O(s)$ as in Lemma 4. If an inconsistency is detected (as discussed
above), we restart the computation for the group $X_i$ from scratch.

2.2'. (Verifying the minimum and maximum.) We create the ordered graph
$H$ corresponding to the comparisons made by $A$, and we extend it to
a multigraph $\overline{H}$ according to Lemma 3. We perform the comparisons
corresponding to the added edges. If we encounter an inconsistency,
then we restart: We go back to Step 2.1' and repeat the computation
for the group $X_i$ from scratch. Otherwise, we proceed with the next
step.

2.3'. We set $m_i := x_1$ and $M_i := x_s$.



Proof of Theorem 1. The correctness of the improved algorithm follows in the
same way as for the simple algorithm. In Step 2.2', the oracle has declared every
element $x_j$, $j \ne 1$, larger than some other element $k + 1$ times, and so $x_j$ cannot
be the minimum. A similar argument applies for the maximum.

It remains to bound the number of comparisons. From the discussion above,
the number of comparisons is at most $((k + 1)(s - 1) + t_A(s))(\frac{n}{s} + k) + 2(k + 1)\frac{n}{s}$,
with $t_A(s) = O(s)$. For $s = k$, we thus get that the number of comparisons is at
most $(k + 1 + C)n + O(k^3)$ for some constant $C$, as claimed. □
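
For the reader's convenience, the substitution $s = k$ can be carried out explicitly. Here we assume $t_A(k) \le ck$ for a constant $c$; the symbol $c$ is introduced only for this illustration.

```latex
\bigl((k+1)(k-1) + ck\bigr)\Bigl(\frac{n}{k} + k\Bigr) + 2(k+1)\frac{n}{k}
  = \Bigl(k + c - \frac{1}{k}\Bigr)n + (k^2 - 1 + ck)\,k + \Bigl(2 + \frac{2}{k}\Bigr)n
  = \Bigl(k + 2 + c + \frac{1}{k}\Bigr)n + O(k^3).
```

Since $1/k \le 1$ for $k \ge 1$, the bound $(k + 1 + C)n + O(k^3)$ holds, for instance, with $C = c + 2$.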
Proof of Lemma 3. We will proceed in two steps. First, we construct a multigraph
$H^*$ from $H$ by adding a maximum number of (multi)edges such that the
left and right degree of every vertex are still bounded above by $k + 1$. Second,
we extend $H^*$ to $\overline{H}$ by adding an appropriate number of edges to each vertex
so that condition (i) holds.

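For an ordered multigraph $H'$ on $\{1, 2, \ldots, s\}$ with left and right degrees
upper bounded by $k + 1$, let us define the defect $\Delta(H')$ as

$$\Delta(H') := \sum_{j=1}^{s-1} \bigl(k + 1 - d_{H'}^{\mathrm{right}}(j)\bigr) + \sum_{j=2}^{s} \bigl(k + 1 - d_{H'}^{\mathrm{left}}(j)\bigr).$$

We have $\Delta(H') = 2(k + 1)(s - 1) - 2e(H')$, where $e(H')$ is the number of edges
of $H'$.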

By a network flow argument, we will show that by adding suitable $m^* :=
(k + 1)(s - 1) - e(H) - t(H)$ edges to $H$, one can obtain a multigraph $H^*$
in which all left and right degrees are still bounded by $k + 1$ and such that
$\Delta(H^*) = 2t(H)$. The desired graph $\overline{H}$ as in the lemma will then be obtained
by adding $\Delta(H^*)$ more edges: For example, for every vertex $j \ge 2$ of $H^*$ with
$d_{H^*}^{\mathrm{left}}(j) < k + 1$, we add $k + 1 - d_{H^*}^{\mathrm{left}}(j)$ edges connecting $j$ to 1, and similarly
we fix the deficient right degrees by adding edges going to the vertex $s$.

It remains to construct $H^*$ as above. To this end, we define an auxiliary
directed graph $G$, where each directed edge $e$ is also assigned an integral capacity
$c(e)$; see Figure 1(a).

Figure 1: The directed graph $G$ constructed in the proof of Lemma 3. (a) The graph $G$ with capacities. (b) The cut $S_i$ in $G$.

The vertex set of $G$ consists of a vertex $j^-$ for every $j \in \{1, 2, \ldots, s\}$, a
vertex $j^+$ for every $j \in \{1, 2, \ldots, s\}$, and two special vertices $a$ and $b$. There is
a directed edge in $G$ from $a$ to every vertex $j^+$ and the capacity of this edge is
$k + 1 - d_H^{\mathrm{right}}(j)$. Similarly, there is a directed edge in $G$ from every vertex $j^-$
to $b$, and the capacity of this edge is $k + 1 - d_H^{\mathrm{left}}(j)$. Moreover, for every $i, j$
with $1 \le i < j \le s$, we put the directed edge $(i^+, j^-)$ in $G$, and the capacity of
this edge is $\infty$ (i.e., a sufficiently large number).

We will check that there is an integral $a$-$b$ flow in $G$ with value $m^*$. By
the max-flow min-cut theorem [4], it suffices to show that every $a$-$b$ cut in $G$
has capacity at least $m^*$ and there is an $a$-$b$ cut in $G$ with capacity $m^*$.

Let $S \subseteq V(G)$ be a minimum $a$-$b$ cut. Let $i$ be the smallest integer such that
$i^+ \in S$. Since the minimum cut cannot use an edge of unbounded capacity, we
have $j^- \in S$ for all $j > i$.

We may assume without loss of generality that $j^+ \in S$ for all $j > i$ and $j^- \notin
S$ for all $j \le i$ (the capacity of the cut does not decrease by doing otherwise).
Therefore it suffices to consider $a$-$b$ cuts of the form

$$S_i := \{a\} \cup \{x^+ : x \ge i\} \cup \{x^- : x > i\}$$

for $i = 1, \ldots, s$. The capacity of $S_i$, see Figure 1(b), equals

$$\sum_{j<i} c(a, j^+) + \sum_{j>i} c(j^-, b) = (s - 1)(k + 1) - \sum_{j<i} d_H^{\mathrm{right}}(j) - \sum_{j>i} d_H^{\mathrm{left}}(j).$$

Now let us look at the quantity $\sum_{j<i} d_H^{\mathrm{right}}(j) + \sum_{j>i} d_H^{\mathrm{left}}(j)$, and see how
much an edge $\{j, j'\}$ ($j < j'$) of $H$ contributes to it: For $j < i < j'$, the
contribution is 2, while all other edges contribute 1. Hence the capacity of the
cut $S_i$ is $(k + 1)(s - 1) - e(H) - t(i)$, and the minimum capacity of an $a$-$b$ cut
is $(k + 1)(s - 1) - e(H) - t(H) = m^*$ as required.

Thus, there is an integral flow $f$ with value $m^*$ as announced above. We now
select the edges to be added to $H$ as follows: For every directed edge $(i^+, j^-)$ of
$G$, we add $f(i^+, j^-)$ copies of the edge $\{i, j\}$, which yields the multigraph $H^*$.
The number of added edges is $m^*$, the value of the flow $f$, and the capacity
constraints guarantee that all left and right degrees in $H^*$ are bounded by $k + 1$.
Moreover, the defect of $H^*$ is at most $2t(H)$. □
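
The construction in this proof can be illustrated by a short sketch. Instead of a general max-flow routine it uses a greedy pass from left to right, which suffices here under the following reasoning: every vertex $i < j$ with spare right degree can also serve all later vertices, so the order in which suppliers are used does not matter. The greedy shortcut and the names `degrees`, `thickness` and `extend` are assumptions of this illustration, not part of the proof.

```python
def degrees(edges, s):
    """Return (left, right) degree lists, indexed 1..s, of an ordered multigraph."""
    left, right = [0] * (s + 1), [0] * (s + 1)
    for a, b in edges:
        lo, hi = min(a, b), max(a, b)
        right[lo] += 1  # lo gains a right neighbor
        left[hi] += 1   # hi gains a left neighbor
    return left, right


def thickness(edges, s):
    """Maximum number of edges passing over an inner vertex."""
    return max((sum(1 for a, b in edges if min(a, b) < j < max(a, b))
                for j in range(2, s)), default=0)


def extend(edges, s, k):
    """Return the edges to add to H so that condition (i) of Lemma 3 holds."""
    left, right = degrees(edges, s)
    assert max(left[1:] + right[1:]) <= k + 1, "degrees must be at most k + 1"
    added = []
    # Phase 1 (building H*): match spare right degree of i with spare left
    # degree of some j > i, keeping all degrees at most k + 1.
    suppliers = []  # vertices to the left of j with right degree below k + 1
    for j in range(2, s + 1):
        if right[j - 1] < k + 1:
            suppliers.append(j - 1)
        while left[j] < k + 1 and suppliers:
            i = suppliers[-1]
            added.append((i, j))
            right[i] += 1
            left[j] += 1
            if right[i] == k + 1:
                suppliers.pop()
    # Phase 2: repair the remaining deficits of H* via vertices 1 and s.
    for j in range(2, s + 1):
        added += [(1, j)] * (k + 1 - left[j])
    for i in range(1, s):
        added += [(i, s)] * (k + 1 - right[i])
    return added
```

For example, with $s = 4$, $k = 1$ and $H = \{\{1, 3\}, \{2, 4\}\}$ we have $t(H) = 1$. Phase 1 adds $m^* = 3$ edges and Phase 2 adds $2t(H) = 2$ more, for a total of $7 = (k + 1)(s - 1) + t(H)$ edges.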

4 Concluding remarks 

We can cast the algorithm for $k = 0$ sketched in the introduction into the
framework of our generic algorithm. Namely, if we set $s = 2$ and in Step 2 we
just compare the two elements in each group, then we obtain that algorithm.
The main feature of our algorithm is that every restart only spoils one group. 
This allows us to keep the effect of lies local. 

In order to improve the upper bound of Theorem 1 by the method of this
paper, we would need a sorting algorithm with thickness $o(s)$. (Moreover, to
make use of the sublinear thickness, we would need to choose $s$ superlinear in $k$,
and thus the sorting algorithm would be allowed to compare every element with
only $o(s)$ others.) The following proposition shows, however, that such a sorting
algorithm does not exist. Thus, we need a different idea to improve Theorem 1.






Proposition 1. Every (randomized) algorithm to sort an $s$-element set has
thickness $\Omega(s)$ in expectation.

Proof. By Yao's principle [10], it is enough to show that every deterministic
sorting algorithm $A$ has expected thickness $\Omega(s)$ for a random input. In our
case, we assume that the unknown linear ordering of X is chosen uniformly at 
random among all the s! possibilities. 

In each step, the algorithm $A$ compares some two elements $x, y \in X$. Let us
say that an element $x \in X$ is virgin at the beginning of some step if it hasn't
been involved in any previous comparison, and elements that are not virgin are 
tainted. A comparison is fresh if it involves at least one virgin element. 

For notational convenience, we assume that $s$ is divisible by 8. Let $L \subseteq X$
consist of the first $s/2$ elements in the (random) input order (which is also the
order of the output of the algorithm), and let $R := X \setminus L$. Let $E_i$ be the event
that the $i$th fresh comparison is an $LR$-comparison, i.e., a comparison in which
one of the two compared elements $x, y$ lies in $L$ and the other in $R$. We claim
that for each $i = 1, 2, \ldots, s/8$, the probability of $E_i$ is at least $\frac{1}{3}$.

To this end, let us fix (arbitrarily) the outcomes of all comparisons made by 
A before the ith fresh comparison, which determines the set of tainted elements, 
and let us also fix the positions of the tainted elements in the input ordering. 
We now consider the probability of Ei conditioned on these choices. The key 
observation is that the virgin elements in the input ordering are still randomly
distributed among the remaining positions (those not occupied by the tainted
elements).

Let $\ell$ be the number of virgin elements in $L$ and $r$ the number of virgin
elements in $R$; we have $s/4 \le \ell, r \le s/2$.

We distinguish two cases. First, let only one of the elements $x, y$ compared
in the $i$th fresh comparison be virgin. Say that $x$ is tainted and lies in $L$. Then
the probability of $E_i$ equals $r/(\ell + r) \ge \frac{1}{3}$.

Second, let both of $x$ and $y$ be virgin. Then the probability of $E_i$ is
$2\ell r/((\ell + r)(\ell + r - 1))$, and since $s/4 \le \ell, r \le s/2$, this probability exceeds $\frac{1}{3}$.
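
The two case bounds can be checked exhaustively for small $s$ with exact rational arithmetic. This is only a sanity check of the inequalities above; the function names are assumptions of this sketch.

```python
from fractions import Fraction


def case_one(l, r):
    """One compared element is virgin and the tainted one lies in L."""
    return Fraction(r, l + r)


def case_two(l, r):
    """Both compared elements are virgin and must land in different halves."""
    return Fraction(2 * l * r, (l + r) * (l + r - 1))


def worst_case(s):
    """Smallest probability of an LR-comparison over all s/4 <= l, r <= s/2."""
    admissible = range(s // 4, s // 2 + 1)
    return min(
        min(case_one(l, r), case_two(l, r))
        for l in admissible
        for r in admissible
    )
```

The minimum is attained in the first case at $r = s/4$, $\ell = s/2$, where the probability is exactly $\frac{1}{3}$. In the second case the ratio $2\ell r/(\ell + r)^2$ is at least $\frac{4}{9}$ on the admissible range.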

Thus, the probability of $E_i$ conditioned on every choice of the outcomes of
the initial comparisons and positions of the tainted elements is at least $\frac{1}{3}$, and
so the probability of $E_i$ for a random input is at least $\frac{1}{3}$ as claimed. Thus, the
expected number of $LR$-comparisons made by $A$ is $\Omega(s)$.

Let a be the largest element of L, i.e., the (s/2)th element of X, and let b 
be the smallest element of R, i.e., the (s/2 + l)st element of X. Since we may 
assume that A doesn't repeat any comparison, there is at most one comparison 
of $a$ with $b$. Every other $LR$-comparison compares elements that have $a$ or $b$ (or
both) between them. Thus, the expected thickness of $A$ is at least half of the
expected number of $LR$-comparisons, which is $\Omega(s)$. □

Note that the only thing which we needed in the proposition above was that 
the corresponding ordered graph is simple and has minimum degree at least 1. 